DETAILED ACTION

1. Claims 1-22 have been examined. 

Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and placed of 
record in the file: Amendment as received on 19 April 2005. 

Claim Rejections - 35 USC §103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

4. Claims 1-22 are rejected under 35 U.S.C. 103(a) as being unpatentable over Greenley, 
U.S. Patent No. 5,761,469 (herein referred to as Greenley) in view of Zaidi, U.S. Patent Number 
5,619,668 (herein referred to as Zaidi). 

5. Regarding claims 1 and 14, taking claim 14 as exemplary, Greenley has taught a 
processing system comprising: 

a. A data processor (Greenley 100 of Fig. 1),

b. A memory coupled to said data processor (Greenley Col. 1 lines 41-43). 

c. Wherein said data processor comprises: 

i. An instruction execution pipeline comprising N processing stages, each of 
said N processing stages capable of performing one of a plurality of 
execution steps associated with a pending instruction being executed by 
said instruction execution pipeline (Greenley Col. 1 lines 34-40).

ii. A data cache (Greenley 180 of Fig. 1) capable of storing data values used
by said pending instruction (Greenley Col. 1 lines 42-43). 

iii. A plurality of registers (150 of Fig. 1) capable of receiving said data values 
from said data cache (Greenley Col. 1 lines 41-45). 

iv. A load store unit (Greenley 130 of Fig. 1) capable of transferring a first one 
of said data values from said data cache to a target one of said plurality of 
registers during execution of a load operation (Greenley Col. 1 lines 15-21,
63-67 and Col.2 lines 1-7, 13-15). 

v. A shifter circuit (Greenley 160, 170 of Fig. 1) associated with said load 
store unit capable of one of a) shifting (Greenley Col.2 lines 19-31), b) 
sign extending (Greenley Col.2 lines 48-54), and c) zero extending 
(Greenley Col.2 lines 45-47) said first data value prior to loading said first 
data value into said target register. 

6. Greenley has not explicitly taught 

a. A plurality of memory-mapped peripheral circuits coupled to said data processor 
for performing selected functions in association with said data processor; and 

b. Bypass circuitry associated with said load store unit capable of transferring said 
first data value from said data cache directly to said target register without 
processing said first data value in said shifter circuit. 

7. However, Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in 
unoccupied bits of a register by extending its sign after it is loaded from the data cache (Greenley 
Col.2 lines 48-50). Zaidi has taught this as well (Zaidi column 8, lines 6-52) and

a. A plurality of memory-mapped peripheral circuits coupled to said data processor
for performing selected functions in association with said data processor (Zaidi 
column 1, line 63 to column 2, line 10); and

b. Bypass circuitry associated with said load store unit capable of transferring said 
first data value from said data cache directly to said target register without 
processing said first data value in said shifter circuit (Zaidi column 5, lines 17-47; 
column 8, line 53 to column 9, line 5; column 9, line 57 to column 10, line 3;
Figure 2; and Figure 5). 

8. A person of ordinary skill in the art at the time the invention was made would have 
recognized, and as taught by Zaidi, the bypass increases the speed of microprocessors (Zaidi 
column 2, lines 33-34), by eliminating the time required to read data from the registers and 
aligning the data, and reduces the need to stall pipelines (Zaidi column 2, lines 35-37), by 
eliminating the need for the pipeline to stall, i.e. delay execution, until the data is ready to be 
read from the registers and aligned properly. Therefore, it would have been obvious to a person
of ordinary skill in the art at the time the invention was made to incorporate the bypass of Zaidi 
in the device of Greenley to increase processor speed. 
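
The data path recited in paragraphs 5 through 7 can be illustrated with a minimal sketch. The sketch assumes a 32-bit word and byte offsets counted from the least significant byte; the names `shift_and_extend` and `load` are hypothetical, and none of these details are taken from Greenley or Zaidi. It shows only the distinction relied on above: a sub-word load passes through the shifter path, while a full-word load is written to the target register without it.

```python
WORD_BITS = 32  # assumed word/register width (hypothetical, not from the references)
WORD_MASK = (1 << WORD_BITS) - 1


def shift_and_extend(raw_word, byte_offset, size_bytes, signed):
    """Shifter path: align a sub-word value, then sign- or zero-extend it."""
    bits = size_bytes * 8
    low_mask = (1 << bits) - 1
    # Shift the addressed bytes down to bit 0 and isolate them.
    value = (raw_word >> (byte_offset * 8)) & low_mask
    # Sign extend: copy the top bit of the sub-word into the upper bits.
    if signed and value & (1 << (bits - 1)):
        value |= WORD_MASK & ~low_mask
    # Otherwise the upper bits stay zero (zero extension).
    return value


def load(raw_word, op, byte_offset=0, signed=False):
    """Return the value written to the target register for a load operation."""
    if op == "word":
        # Bypass: a full word needs no alignment or extension.
        return raw_word & WORD_MASK
    size_bytes = {"half": 2, "byte": 1}[op]
    return shift_and_extend(raw_word, byte_offset, size_bytes, signed)
```

Under these assumptions, `load(0x1234ABCD, "word")` returns the cache value unchanged, whereas `load(0x1234ABCD, "half", signed=True)` returns `0xFFFFABCD` after alignment and sign extension.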

9. Claim 1 is nearly identical to claim 14. Claim 1 differs in its lack of a main memory and 
memory-mapped peripheral circuits, but comprises the same data processor as claim 14, and is 
therefore rejected for the same reasons. 

10. Regarding claims 2 and 15, taking claim 15 as exemplary, Greenley in view of Zaidi has
taught the processing system as set forth in claim 14, wherein said bypass circuitry transfers said 
first data value from said data cache directly to said target register during a load word operation 
(see above rejection of claim 1, as well as Zaidi column 5, lines 17-47; column 8, line 53 to
column 9, line 5; column 9, line 57 to column 10, line 3; Figure 2; and Figure 5). While Greenley 
has taught a different register size than the applicant (Greenley Col.2 lines 17-19), the situation 
when a register has no unoccupied bits after being loaded with data from a data cache remains 
the same, with the size of the register and word being moot. Therefore Greenley's loading of a
double word has the same consequences as the applicant's loading of a word. 

11. Claim 2 is nearly identical to claim 15. Claim 2 differs in its parent claim, but comprises
the same data processor as claim 15, and is therefore rejected for the same reasons. 

12. Regarding claims 3 and 16, taking claim 16 as exemplary, Greenley in view of Zaidi has 
taught the data processor as set forth in claim 15, wherein said bypass circuitry (Zaidi column 5, 
lines 17-47; column 8, line 53 to column 9, line 5; column 9, line 57 to column 10, line 3; Figure
2; and Figure 5) transfers said first data value from said data cache directly to said target register 
at the end of two machine cycles (Greenley Col. 4 lines 17-20). 

13. Claim 3 is nearly identical to claim 16. Claim 3 differs in its parent claim, but comprises 
the same data processor as claim 16, and is therefore rejected for the same reasons. 

14. Regarding claims 4 and 17, taking claim 17 as exemplary, Greenley in view of Zaidi has 
taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, b) 
sign extends, or c) zero extends said first data value prior to loading said first data value into said 
target register during a load half-word operation (see Greenley Col.2 lines 17-20, 24-30, 46-47). 

15. Claim 4 is nearly identical to claim 17. Claim 4 differs in its parent claim, but comprises 
the same data processor as claim 17, and is therefore rejected for the same reasons.

16. Regarding claims 5 and 18, taking claim 18 as exemplary, Greenley in view of Zaidi has
taught the data processor as set forth in claim 17, wherein said shifter circuit loads said shifted 
first data value into said target register at the end of two machine cycles (Greenley Col. 4 lines 
17-20), but has not explicitly taught the load taking three machine cycles. However, Greenley 
has taught this two-cycle latency for all load instructions, including those without the need for 
sign extension. In the situation where the load instruction fills the target register completely and 
no sign extension is needed, such as when the data is already properly aligned since it is coming 
from the data cache, which only contains aligned data, or from the ALU, which outputs aligned 
data, the data processor as configured above will execute the load instruction at least one cycle 
faster due to the elimination of the alignment operations (Zaidi column 5, lines 17-47; column 8,
line 53 to column 9, line 5; column 9, line 57 to column 10, line 3; Figure 2; and Figure 5). This 
will create a latency of at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, such as half- 
word load instructions. Because the applicant's claimed load instructions, which take 2 and 3 
cycles for bypassed and sign-extended instructions, respectively, have no claimed advantages 
over the 1 and 2 cycles that Greenley in view of Zaidi have taught, but are merely a change in the
magnitude of latency, they are considered to be equivalent and thus taught by Greenley in view 
of Zaidi (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 
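
The cycle counts compared in the preceding paragraph can be tabulated in a short sketch. The figures are exactly those characterized in this action (1 and 2 cycles for the combination, 2 and 3 cycles as claimed); the dictionary and function names are hypothetical and serve only to show that both pairs differ by the same single cycle between the bypassed and sign-extended paths.

```python
# Load latencies in machine cycles, as characterized in this action.
combination = {"bypassed": 1, "sign_extended": 2}  # Greenley in view of Zaidi
claimed = {"bypassed": 2, "sign_extended": 3}      # applicant's claimed loads


def shifter_penalty(latencies):
    """Extra cycles a sign-extended load takes over a bypassed load."""
    return latencies["sign_extended"] - latencies["bypassed"]


def uniform_offset(a, b):
    """Return the constant cycle difference between two latency tables, or None."""
    diffs = {b[key] - a[key] for key in a}
    return diffs.pop() if len(diffs) == 1 else None
```

Here `shifter_penalty` yields 1 for both tables, and `uniform_offset(combination, claimed)` yields 1, which is the sense in which the action treats the claimed latencies as a change in magnitude only.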

17. Claim 5 is nearly identical to claim 18. Claim 5 differs in its parent claim, but comprises 
the same data processor as claim 18, and is therefore rejected for the same reasons. 

18. Regarding claims 6 and 19, taking claim 19 as exemplary, Greenley in view of Zaidi has 
taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, b) 
sign extends, and c) zero extends said first data value prior to loading said first data value into
said target register during a load byte operation (see Greenley Col. 2 lines 17-20, 36-40, 46-47). 

19. Claim 6 is nearly identical to claim 19. Claim 6 differs in its parent claim, but comprises 
the same data processor as claim 19, and is therefore rejected for the same reasons. 

20. Regarding claims 7 and 20, taking claim 20 as exemplary, Greenley in view of Zaidi has 
taught the data processor as set forth in claim 6, wherein said shifter circuit loads said shifted 
first data value into said target register at the end of two machine cycles (Greenley Col. 4 lines 
17-20), but has not explicitly taught the transfer taking three machine cycles. However, 
Greenley has taught this two-cycle latency for all load instructions, including those without the 
need for sign extension. In the situation where the load instruction fills the target register 
completely and no sign extension is needed, such as when the data is already properly aligned 
since it is coming from the data cache, which only contains aligned data, or from the ALU, 
which outputs aligned data, the data processor as configured above will execute the load 
instruction at least one cycle faster due to the elimination of the alignment operations (Zaidi 
column 5, lines 17-47; column 8, line 53 to column 9, line 5; column 9, line 57 to column 10, 
line 3; Figure 2; and Figure 5). This will create a latency of at least one cycle fewer for those 
load instructions which bypass the sign extension unit, and at least one more cycle for those 
which need sign extension, such as half-word load instructions. Because the applicant's claimed
load instructions, which take 2 and 3 cycles for bypassed and sign-extended instructions, 
respectively, have no claimed advantages over the 1 and 2 cycles that Greenley in view of Zaidi
have taught, but are merely a change in the magnitude of latency, they are considered to be 
equivalent and thus taught by Greenley in view of Zaidi (see In re Rose, 220 F.2d 459, 463, 105
USPQ 237, 240 (CCPA 1955)). 

21. Claim 7 is nearly identical to claim 20. Claim 7 differs in its parent claim, but comprises
the same data processor as claim 20, and is therefore rejected for the same reasons. 

22. Regarding claims 8, 9, 21, and 22, taking claims 21 and 22 as exemplary, Greenley in 
view of Zaidi has taught the data processor as set forth in claim 14, but Greenley has not 
explicitly taught 

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
coupled to a data output of said data cache; and 

b. Wherein said multiplexer has a second input channel coupled to an output of said 
shifter circuit. 

23. However, Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in 
unoccupied bits of a register by extending its sign after it is loaded from the data cache (Greenley 
Col.2 lines 48-50). Zaidi has taught this as well (Zaidi column 8, lines 6-52) and

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
coupled to a data output of said data cache (Zaidi column 5, lines 17-47; column 
8, line 53 to column 9, line 5; column 9, line 57 to column 10, line 3; Figure 2;
and Figure 5); and 

b. Wherein said multiplexer has a second input channel coupled to an output of said 
shifter circuit (Zaidi column 5, lines 17-47; column 8, line 53 to column 9, line 5; 
column 9, line 57 to column 10, line 3; Figure 2; and Figure 5).

24. A person of ordinary skill in the art at the time the invention was made would have
recognized, and as taught by Zaidi, the bypass increases the speed of microprocessors (Zaidi 
column 2, lines 33-34), by eliminating the time required to read data from the registers and 
aligning the data, and reduces the need to stall pipelines (Zaidi column 2, lines 35-37), by
eliminating the need for the pipeline to stall, i.e. delay execution, until the data is ready to be 
read from the registers and aligned properly. Therefore, it would have been obvious to a person 
of ordinary skill in the art at the time the invention was made to incorporate the bypass of Zaidi 
in the device of Greenley to increase processor speed. 
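
The multiplexer arrangement recited in paragraphs 22 and 23 can be pictured with a minimal behavioral sketch. It is an illustration under stated assumptions rather than a description of any circuit in Greenley or Zaidi: the names `mux2` and `register_write_value` and the select encoding (0 for the cache channel, 1 for the shifter channel) are hypothetical.

```python
def mux2(select, channel0, channel1):
    """Two-input multiplexer: select 0 passes channel0, select 1 passes channel1."""
    if select not in (0, 1):
        raise ValueError("select must be 0 or 1")
    return channel1 if select else channel0


def register_write_value(cache_out, shifter_out, use_shifter):
    """Choose the value written to the target register.

    First input channel: data output of the data cache (bypass path).
    Second input channel: output of the shifter circuit.
    """
    return mux2(1 if use_shifter else 0, cache_out, shifter_out)
```

With `use_shifter` false, the cache output reaches the register unchanged; with it true, the shifter output is written instead.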

25. Claims 8 and 9 are nearly identical to claims 21 and 22 respectively. Claims 8 and 9
differ in their parent claims, but comprise the same data processor as claims 21 and 22, and are
therefore rejected for the same reasons.

26. Regarding claim 10, Greenley has taught for use in a processor comprising an N-stage 
execution pipeline (Greenley Col. 1 lines 34-40), a data cache (Greenley 180 of Fig. 1), and a
plurality of registers (Greenley 150 of Fig. 1), a method of loading a first data value from the data 
cache into a target one of the registers, the method comprising the steps of 

a. Determining if a pending instruction in the execution pipeline is one of a load 
word operation, a load half-word operation, and a load byte operation (Greenley 
Col. 1 lines 63-67, Col.2 lines 1-7, 17-19 and Col.5 lines 13-24).

b. In response to a determination that the pending instruction is a load half-word 
operation, transferring the first data value from the data cache to a shifter circuit 
and shifting the first data value prior to loading the first data value into the target 
register (Greenley Col.2 lines 24-31).

c. In response to a determination that the pending instruction is a load byte
operation, transferring the first data value from the data cache to a shifter circuit
and shifting the first data value prior to loading the first data value into the target
register (Greenley Col. 2 lines 35-40).

27. Greenley has not explicitly taught where in response to a determination that the pending
instruction is a load word operation, transferring the first data value from the data cache directly 
to the target register without processing the first data value in the shifter circuit. However, 
Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in unoccupied bits of 
a register by extending its sign after it is loaded from the data cache (Greenley Col. 2 lines 48- 
50). Zaidi has taught this as well (Zaidi column 8, lines 6-52) and where in response to a 
determination that the pending instruction is a load word operation, transferring the first data 
value from the data cache directly to the target register without processing the first data value in 
the shifter circuit (Zaidi column 5, lines 17-47; column 8, line 53 to column 9, line 5; column 9, 
line 57 to column 10, line 3; Figure 2; and Figure 5). A person of ordinary skill in the art at the 
time the invention was made would have recognized, and as taught by Zaidi, the bypass 
increases the speed of microprocessors (Zaidi column 2, lines 33-34), by eliminating the time 
required to read data from the registers and aligning the data, and reduces the need to stall 
pipelines (Zaidi column 2, lines 35-37), by eliminating the need for the pipeline to stall, i.e. delay 
execution, until the data is ready to be read from the registers and aligned properly. Therefore, it 
would have been obvious to a person of ordinary skill in the art at the time the invention was 
made to incorporate the bypass of Zaidi in the device of Greenley to increase processor speed.

28. Regarding claim 11, Greenley in view of Zaidi has taught the method as set forth in claim
10, wherein the step of transferring the first data value requires two machine cycles during a load 
word operation (Greenley Col. 4 lines 17-20). While Greenley has taught a different register size 
than the applicant (Greenley Col.2 lines 17-19), the situation when a register has no unoccupied 
bits after being loaded with data from a data cache remains the same, with the size of the register 
and word being moot. Therefore Greenley's loading of a double word has the same
consequences as the applicant's loading of a word (In re Rose, 220 F.2d 459, 463, 105 USPQ
237, 240 (CCPA 1955)). 

29. Regarding claim 12, Greenley in view of Zaidi has taught the method as set forth in claim 
10, wherein the step of transferring the first data value requires two machine cycles during a load 
half-word operation (Greenley Col. 4 lines 17-20), but has not explicitly taught the transfer taking
three machine cycles. However, Greenley has taught this two-cycle latency for all load 
instructions, including those without the need for sign extension. In the situation where the load 
instruction fills the target register completely and no sign extension is needed, such as when the 
data is already properly aligned since it is coming from the data cache, which only contains 
aligned data, or from the ALU, which outputs aligned data, the data processor as configured
above will execute the load instruction at least one cycle faster due to the elimination of the 
alignment operations (Zaidi column 5, lines 17-47; column 8, line 53 to column 9, line 5; column
9, line 57 to column 10, line 3; Figure 2; and Figure 5). This will create a latency of at least one
cycle fewer for those load instructions which bypass the sign extension unit, and at least one 
more cycle for those which need sign extension, such as half-word load instructions. Because 
the applicant's claimed load instructions, which take 2 and 3 cycles for bypassed and sign-extended
instructions, respectively, have no claimed advantages over the 1 and 2 cycles that
Greenley in view of Zaidi have taught, but are merely a change in the magnitude of latency, they
are considered to be equivalent and thus taught by Greenley in view of Zaidi (see In re Rose, 220
F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 

30. Regarding claim 13, Greenley in view of Zaidi has taught the method as set forth in claim 
10 wherein the step of transferring the first data value requires two machine cycles during a load 
byte operation (Greenley Col. 4 lines 17-20), but has not explicitly taught the transfer taking three 
machine cycles. However, Greenley has taught this two-cycle latency for all load instructions, 
including those without the need for sign extension. In the situation where the load instruction 
fills the target register completely and no sign extension is needed, such as when the data is 
already properly aligned since it is coming from the data cache, which only contains aligned 
data, or from the ALU, which outputs aligned data, the data processor as configured above will
execute the load instruction at least one cycle faster due to the elimination of the alignment
operations (Zaidi column 5, lines 17-47; column 8, line 53 to column 9, line 5; column 9, line 57
to column 10, line 3; Figure 2; and Figure 5). This will create a latency of at least one cycle 
fewer for those load instructions which bypass the sign extension unit, and at least one more 
cycle for those which need sign extension, such as half-word load instructions. Because the 
applicant's claimed load instructions, which take 2 and 3 cycles for bypassed and sign-extended 
instructions, respectively, have no claimed advantages over the 1 and 2 cycles that Greenley in
view of Zaidi have taught, but are merely a change in the magnitude of latency, they are 
considered to be equivalent and thus taught by Greenley in view of Zaidi (see In re Rose, 220 
F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)).

Response to Arguments

31. Applicant's arguments filed 19 April 2005 have been fully considered but they are not
persuasive. 

32. Applicant argues in essence on pages 10-14 

The bypass mechanism of Zaidi simply takes data from a "source other than [a]
register file" and provides the data as input to an arithmetic logic unit ("ALU") 
12. (Abstract). The bypass mechanism of Zaidi never transfers data from a data
cache "directly to [a] target register." In fact, the bypass mechanism of Zaidi is 
specifically designed to bypass a register file 13, not to provide data from a data
cache "directly to" the register file 13.

33. This has not been found persuasive. The claim limitations state, for example in claim 1,
"bypass circuitry associated with said load store unit capable of transferring said first data value
from said data cache directly to said target register without processing said first data value in said 
shifter circuit." The claim limitations require only that a circuit transfer data from the data cache 
to the register file without having to go through a shifter. Zaidi shows in Figures 2 and 5 a direct 
connection, M, from the data cache 14 to the register file 13. The Examiner cited column 5, lines
17-47; column 8, line 53 to column 9, line 5; column 9, line 57 to column 10, line 3; Figure 2; 
and Figure 5 in the rejection to show that the shifting of data occurs in the multiplexers 19 and
18 in Figure 2 and 51e and 51e in Figure 5. While Zaidi does store the data from the ALU
and/or data cache into the bypass latch 16 in Figure 2, Zaidi also directly stores the data from the 
data cache into the target register in the register file 13 of Figure 2.

34. Applicant argues in essence on page 12 "... As a result, the Office Action provides a
motivation to avoid writing data to a register, which teaches away from the claims 'bypass 
circuitry' ..." This has not been found persuasive. The portion of the Examiner's motivation 
believed to be argued is "by eliminating the time required to read data from the registers and 
aligning the data, and reduces the need to stall pipelines (Zaidi column 2, lines 35-37)." The 
Examiner meant by this statement that it eliminates the time required to both read data and align 
the data from the registers, and reduces the need to stall the pipelines. The Examiner was 
referring to both actions combined, not the individual action of reading data from the register. It 
is known by a person of ordinary skill in the art that a significant portion of the time needed to
read and align data is dedicated to the data alignment portion of this task. So, by eliminating the
need to align the data, the speed of the processor is increased. 

Conclusion

35. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time
policy as set forth in 37 CFR 1.136(a).

36. A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action.

37. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Aimee J. Li whose telephone number is (571) 272-4169. The 
examiner can normally be reached on M-T 7:30am-5:00pm.

38. If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan, can be reached on (571) 272-4162. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306. 

39. Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



AJL 

Aimee J. Li 
24 June 2005 




